Big Learning with Little RAM
Authors
Abstract
In large-scale machine learning, available memory (RAM) is often a key constraint, both during model training and when making new predictions. In this paper, we reduce memory cost by projecting our weight vector β ∈ ℝ^d onto a coarse discrete set using randomized rounding. Because the values of the discrete set can be stored more compactly than standard 32-bit float encodings, this reduces RAM usage by 50% during training and by up to 90% at prediction time. Theoretical analysis provides safety guarantees that bound the regret added by this projection. Empirical evaluation confirms excellent results in practice, increasing test logistic loss by only 0.01%.
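The core idea, projecting each weight onto a coarse grid with unbiased randomized rounding, can be sketched as below. This is a minimal illustration, not the paper's implementation; the function name, the uniform grid spacing, and the NumPy usage are all assumptions made here for clarity.

```python
import numpy as np

def randomized_round(beta, resolution, rng=None):
    """Project weights onto a grid of spacing `resolution` using
    unbiased randomized rounding: each weight is rounded up to the
    next grid point with probability equal to its fractional offset,
    so E[rounded] == beta elementwise."""
    rng = np.random.default_rng() if rng is None else rng
    lower = np.floor(beta / resolution) * resolution   # grid point below
    frac = (beta - lower) / resolution                 # offset in [0, 1)
    round_up = rng.random(beta.shape) < frac           # up w.p. frac
    return lower + round_up * resolution
```

Because the result always lies on a known coarse grid, each weight can be stored as a small integer index instead of a 32-bit float, which is the source of the memory savings; the rounding is unbiased, so the expected weight vector is unchanged.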
Similar articles
The Gamma Operator for Big Data Summarization on an Array DBMS
SciDB is a parallel array DBMS that provides multidimensional arrays, a query language and basic ACID properties. In this paper, we introduce a summarization matrix operator that computes sufficient statistics in one pass and in parallel on an array DBMS. Such sufficient statistics benefit a big family of statistical and machine learning models, including PCA, linear regression and variable sel...
Large-Scale Learning with Less RAM via Randomization
We reduce the memory footprint of popular large-scale online learning methods by projecting our weight vector onto a coarse discrete set using randomized rounding. Compared to standard 32-bit float encodings, this reduces RAM usage by more than 50% during training and by up to 95% when making predictions from a fixed model, with almost no loss in accuracy. We also show that randomized counting ...
On the Interrelationships among Undergraduate English Foreign Language Learners’ Speaking Ability, Personality Traits, and Learning Styles
The vital role that individual differences, such as personality variation, play has long been discussed as the origin of different learning abilities. Accordingly, a cross-sectional survey and a descriptive study were conducted. Data were gathered from a sample of 150 students of both genders (107 females and 43 males) with an age range of 19-22. The translated and validated versions of the Big Five p...
Cultural Components and Subcomponents in Two Persian and English Language Teaching Textbooks: A Comparative Study
The present qualitative research, for the first time, aimed at comparing and contrasting the extent to which cultural components and subcomponents are represented in the elementary levels of A Course in General Persian and Top Notch Series as foreign language teaching textbooks. The adapted checklist of Lee's Big ‘C' and little ‘c' cultural components (2009) was used for the current study. After conten...
Using Big Data for Predicting Freshmen Retention
Traditional research in student retention is survey-based, relying on data collected from questionnaires, which is not optimal for proactive prediction and real-time decision (student intervention) support. Machine learning approaches have their own limitations. Therefore, in this research, we propose a big data approach to formulating a predictive model. We used commonly available (student dem...
Journal:
Volume, Issue:
Pages: -
Publication date: 2012